N=1:

rank | frequency | n-gram |
---|---|---|
1 | 6286 | -n |
2 | 4766 | -a |
3 | 2608 | -i |
4 | 1725 | -s |
5 | 1626 | -r |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 4875 | -an |
2 | 1579 | -ya |
3 | 1351 | -ng |
4 | 900 | -ah |
5 | 696 | -at |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 1438 | -kan |
2 | 1388 | -nya |
3 | 649 | -ang |
4 | 394 | -gan |
5 | 359 | -ran |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 422 | -nnya |
2 | 361 | -ngan |
3 | 249 | -akan |
4 | 204 | -anya |
5 | 200 | -ikan |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 392 | -annya |
2 | 208 | -angan |
3 | 117 | -ngkan |
4 | 83 | -arkan |
5 | 75 | -atkan |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := (@pos + 1), xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3))
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
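
The query above is fixed to N=3. As a minimal sketch of how the same count could be parameterised over N, the following Python example runs an equivalent query against an in-memory SQLite database. This is not the original MySQL setup. The function name `top_endings`, the parameter `min_w_id`, and the toy word list are assumptions made for illustration only. The `words(w_id, word)` layout is taken from the query. SQLite has no `RIGHT()`, so `substr(word, -N)` is used instead, and the rank is assigned in Python rather than with a user variable.

```python
import sqlite3


def top_endings(conn, n, limit=5, min_w_id=100):
    """Return (rank, frequency, ending) for the most frequent n-letter word endings."""
    rows = conn.execute(
        """
        SELECT count(*) AS cnt, '-' || substr(word, ?) AS ending
        FROM words
        WHERE w_id > ?
        GROUP BY ending
        ORDER BY cnt DESC, ending
        LIMIT ?
        """,
        # A negative start position makes substr() count from the right.
        (-n, min_w_id, limit),
    ).fetchall()
    # Number the rows in Python instead of using a MySQL user variable.
    return [(rank, cnt, ending) for rank, (cnt, ending) in enumerate(rows, start=1)]


# Toy data standing in for the corpus table (hypothetical, not real counts).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT)")
toy_words = ["makanan", "minuman", "berikan", "katakan", "bukunya", "rumahnya"]
conn.executemany(
    "INSERT INTO words (w_id, word) VALUES (?, ?)",
    [(101 + i, w) for i, w in enumerate(toy_words)],
)

for n in (1, 2, 3):
    print(n, top_endings(conn, n))
```

Ties are broken alphabetically by the secondary sort key so that the output is deterministic. The original query leaves the order of equal counts to the database.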
2.2.5 Most frequent word beginnings